Serine protease variants having peptide ligase activity

ABSTRACT

The invention relates to serine protease variants derived from precursor serine proteases via recombinant and/or chemical methods to form protease variants having improved peptide ligase activity. The invention also includes novel ligation substrates which in combination with the serine protease variants and a second ligation substrate are capable of forming a ligation product. The invention also relates to methods for forming such ligation products and the products formed thereby.

This is a continuation-in-part of U.S. patent application Ser. No. 07/566,026, filed Aug. 9, 1990, now abandoned.

TECHNICAL FIELD OF THE INVENTION

The invention relates to serine protease variants derived from precursor serine proteases via recombinant and/or chemical methods to form protease variants having improved peptide ligase activity. The invention also includes novel ligation substrates which in combination with the serine protease variants and a second ligation substrate are capable of forming a ligation product. The invention also relates to methods for forming such ligation products and the products formed thereby.

BACKGROUND OF THE INVENTION

Chemical approaches for synthesis and engineering of proteins offer many advantages over recombinant methods in that one can incorporate non-natural or selectively labelled amino acids. However, peptide synthesis is practically limited to small proteins (typically <50 residues) due to the accumulation of side-products and racemization that complicate product purification and decrease yields (for recent reviews see Kaiser, E. T. (1989) Acc. Chem. Res. 22, 47-54; Offord, R. E. (1987) Prot. Eng. 1, 151-157).

Proteolytic enzymes, in particular serine proteases, have reportedly been used as alternatives to synthetic peptide chemistry because of their stereoselective properties and mild reaction conditions (for reviews see Kullman, W. (1987) In: Enzymatic Peptide Synthesis, CRC Press, Florida U.S.; Chaiken, (1981) CRC Crit. Rev. Biochem. 11, 255-301). Such enzymes reportedly have been used to complement chemical coupling methods to produce larger peptides by blockwise enzymatic coupling of synthetic fragments (Inouye et al. (1979), J. Am. Chem. Soc., 101, 751-752 (insulin fragments); Hommandberg and Laskowski, (1979) Biochemistry 18, 586-592 (ribonuclease fragments)). However, the narrow substrate specificities and intrinsic hydrolytic (peptidase) activity of serine proteases have limited their use in peptide synthesis.

A central problem in the use of serine proteases in peptide synthesis is that hydrolysis of the acyl-enzyme intermediate is strongly favored over aminolysis (FIG. 1). Several laboratories have reported that the equilibrium is shifted from hydrolysis toward aminolysis by use of mixed or pure organic solvents to carry out catalysis (Coletti-Previero et al., (1969) J. Mol. Biol. 39, 493-501; Barbas et al., (1988) J. Am. Chem. Soc. 110, 5162-5166). However, enzymes are generally less stable and relatively insoluble in organic solvents (Wong et al., (1990) J. Am. Chem. Soc. 112, 945-953; Klibanov, (1986) Chemtech 16, 354-359). Further, kinetic activation barriers in organic solvents are higher for the charged transition-states involved, leading to lower enzymatic activity. In an attempt to avoid these problems, one laboratory reported that thiolsubtilisin, a derivative of the bacterial serine protease in which the active site Ser221 was chemically converted to a Cys (S221C), shifted the preference toward aminolysis over hydrolysis by >1000-fold for very small peptides. Nakatsuka et al. (1987) J. Am. Chem. Soc. 109, 3808-3810.

This shift was attributed to the kinetic preference of thioesters to react with amines over water. Based upon similar principles, another laboratory reported that selenolsubtilisin had a 14,000-fold shift in preference for aminolysis over hydrolysis. Wu and Hilvert (1989) J. Am. Chem. Soc. 111, 4513-4514. However, the catalytic efficiencies for aminolysis of a chemically activated ester by either thiol- or selenolsubtilisin are about 10³- and 10⁴-fold, respectively, below the esterase activity of wild-type subtilisin. Although chemically activated esters have reportedly been used to increase the rates for acylation of thiol- or selenolsubtilisin (e.g. the acylation of thiolsubtilisin with a p-chlorophenyl ester of an 8-mer peptide for ligation with a 4-mer peptide in >50% DMF), such activated esters present synthetic difficulties as well as creating substrates prone to spontaneous hydrolysis in aqueous solvents (Nakatsuka et al. (1987) supra.).

The serine proteases comprise a diverse class of enzymes having a wide range of specificities and biological functions. Stroud, R. M. (1974) Sci Amer. 131, 74-88. Despite their functional diversity, the catalytic machinery of serine proteases has been approached by at least two genetically distinct families of enzymes: the Bacillus subtilisin-type serine proteases and the mammalian and homologous bacterial trypsin-type serine proteases (e.g., trypsin and S. griseus trypsin). These two families of serine proteases show remarkably similar mechanisms of catalysis. Kraut, J. (1977) Ann. Rev. Biochem., 46, 331-358. Furthermore, although the primary structure is unrelated, the tertiary structure of these two enzyme families brings together a conserved catalytic triad of amino acids consisting of serine, histidine and aspartate.

Subtilisin is a serine endoprotease (MW ~27,500) which is secreted in large amounts from a wide variety of Bacillus species. The protein sequence of subtilisin has been determined from at least four different species of Bacillus. Markland, F. S., et al. (1971) in The Enzymes, ed. Boyer, P. D., Acad. Press, New York, Vol. III, pp. 561-608; Nedkov, P. et al. (1983) Hoppe-Seyler's Z. Physiol. Chem., 364, 1537-1540. The three-dimensional crystallographic structure of subtilisin BPN' (from B. amyloliquefaciens) to 2.5 Å resolution has also been reported. Bott, et al. (1988), J. Biol. Chem., 263, 7895-7906; McPhalen, et al. (1988), Biochemistry, 27, 6582-6598; Wright, C. S., et al. (1969), Nature, 221, 235-242; Drenth, J. et al. (1972) Eur. J. Biochem., 26, 177-181. These studies indicate that although subtilisin is genetically unrelated to the mammalian serine proteases, it has a similar active site structure. Reported x-ray crystal structures of subtilisin containing covalently bound peptide inhibitors (Robertus, J. D., et al. (1972), Biochemistry, 11, 2439-2449), product complexes (Robertus, J. D., et al. (1972) Biochemistry 11, 4293-4303), and transition state analogs (Matthews, D. A., et al (1975) J. Biol. Chem. 250, 7120-7126; Poulos, T. L., et al. (1976) J. Biol. Chem. 251, 1097-1103) have also provided information regarding the active site and putative substrate binding cleft of subtilisin. In addition, a large number of kinetic and chemical modification studies have been reported for subtilisin (Philipp, M., et al. (1983) Mol. Cell. Biochem. 51, 5-32; Svendsen, I. B. (1976) Carlsberg Res. Comm. 41, 237-291; Markland, F. S., Id.; Stauffer, D. C., et al. (1965) J. Biol. Chem. 244, 5333-5338; Polgar, L. et al. (1981) Biochem. Biophys. Acta 667, 351-354).

U.S. Pat. No. 4,760,025 discloses subtilisin mutants wherein a different amino acid is substituted for the naturally-occurring amino acid residues of Bacillus amyloliquefaciens subtilisin at positions +32, +155, +104, +222, +166, +64, +33, +169, +189, +217, or +156.

The references discussed above are provided solely for their disclosure prior to the filing date of the present case, and nothing herein is to be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior invention or priority based on earlier filed applications.

SUMMARY OF THE INVENTION

Based on the foregoing, it is apparent that the chemical synthesis of large peptides and proteins is severely limited by the available chemical synthesis techniques and the lack of an efficient peptide ligase which is capable of coupling block synthetic or recombinant peptides.

Accordingly, it is an object herein to provide serine protease variants which are capable of efficiently ligating peptides and other substrates.

Further, it is an object herein to provide ligation substrates which when used in combination with the aforementioned serine protease variants are capable of ligating a first ligation substrate with a second ligation substrate to form a ligation product.

Further, it is an object herein to provide methods for producing ligation products using serine protease variants and ligation substrates.

Further, it is an object herein to provide ligation products made by the aforementioned methods.

In accordance with the foregoing objects, the invention includes serine protease variants having an amino acid sequence not found in nature which are derived from a precursor serine protease by at least two changes in the amino acid residues of the precursor serine protease. In particular, the precursor serine proteases are characterized by a catalytic serine residue which participates in the catalysis normally carried out by the precursor enzyme. This active site serine residue is replaced with a different amino acid to substitute the nucleophilic oxygen of the serine side chain with a different nucleophile. Alternatively, the side chain of the active site serine may be directly modified chemically to substitute a different nucleophile for the nucleophilic oxygen. The second change comprises replacement or modification of a second amino acid residue not consisting of the active site serine of the precursor enzyme. Such replacement or modification of the second amino acid residue, in combination with the replacement or modification of the active site serine, produces a serine protease variant characterized by a peptide ligase activity measured in aqueous solution which is greater than that of a different serine protease variant containing only the substitution or modification of the nucleophilic oxygen at the active site serine.

The invention also includes ligation substrates which are useful in combination with the aforementioned serine protease variants. In various precursor serine proteases, the enzyme catalyzes the cleavage of a peptide bond (the scissile peptide bond). The standard designation of protease hydrolysis substrate residues using the nomenclature of Schechter and Berger (1967) Biochem. Biophys. Res. Commun. 27, 157-162 and the scissile bond hydrolyzed is shown in FIG. 2A. Since peptide ligation is essentially the reversal of hydrolysis, ligation substrates are defined similarly as depicted in FIG. 2B. Thus, in one aspect of the invention, first ligation substrates comprise at least an R1 amino acid residue with the carboxy terminus of the R1 residue esterified with an organic alcohol (e.g. a 2-hydroxy carboxylic acid) or thiol. The R1 residue comprises those amino acid residues which preferentially bind to the precursor serine hydrolase or which preferentially bind to those serine hydrolase variants which have been further modified to alter substrate specificity at the P1 position. Such R1 residues also comprise non-naturally occurring amino acids for which the variant has specificity. In addition, the esterified 2-hydroxy carboxylic acid closely resembles the P1' residue in substrates for the precursor serine protease or the residues preferred by those serine protease variants whose specificity has been modified at the P1' position.

The invention also includes ligation methods wherein the serine protease variant of the invention is contacted with a first and a second ligation substrate to form a ligation product. The first ligation substrate comprises the aforementioned ligation substrate (FIG. 2B). The second ligation substrate comprises at least an R1' amino acid residue for which the serine protease variant has specificity (FIG. 2C). It may also comprise non-naturally occurring amino acids for which the variant has specificity. The ligation product so formed by a ligation of the first and second ligation substrate contains the sequence R1-R1' (FIG. 2D).

In addition, the invention includes ligation products made by the aforementioned method. Such products typically have a length greater than about 17 amino acid residues. Such ligation products are also characterized by the ligation method, which may be carried out in, but is not limited to, aqueous solution.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts the kinetic aspects of peptide ligation. In Equation 1, the ester bond is cleaved by the enzyme, the alcohol group leaves and the acid becomes acylated to the active site nucleophile to form an acyl-enzyme intermediate (acyl-enzyme). Two possible reactions of the acyl-intermediate are hydrolysis (Equation 2) and aminolysis (Equation 3). Hydrolysis is nonproductive whereas aminolysis leads to adduct formation. The amidase activity of the enzyme (Equation 4) results in cleavage of the adduct and reformation of the acyl-enzyme intermediate.
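As an illustration only, the competition between hydrolysis (Equation 2) and aminolysis (Equation 3) for the acyl-enzyme intermediate can be sketched in Python. The rate constants, the function name and the simple competition model are assumptions made for this sketch and are not taken from FIG. 1: aminolysis is treated as second order in the nucleophile, hydrolysis as pseudo-first order because water is in large excess, and re-cleavage of the adduct (Equation 4) is ignored.

```python
def aminolysis_fraction(k_aminolysis, k_hydrolysis, nucleophile_conc):
    """Fraction of acyl-enzyme that partitions to aminolysis (Equation 3)
    rather than hydrolysis (Equation 2).

    Assumed model (hypothetical, not from the source):
      rate of aminolysis = k_aminolysis * [nucleophile]   (M^-1 s^-1 * M)
      rate of hydrolysis = k_hydrolysis                   (s^-1, water in excess)
    Cleavage of the adduct (Equation 4) is ignored.
    """
    if k_aminolysis < 0 or k_hydrolysis < 0 or nucleophile_conc < 0:
        raise ValueError("rate constants and concentration must be non-negative")
    aminolysis_rate = k_aminolysis * nucleophile_conc
    total_rate = aminolysis_rate + k_hydrolysis
    if total_rate == 0:
        raise ValueError("at least one pathway must have a non-zero rate")
    return aminolysis_rate / total_rate


# Hypothetical numbers chosen so that both pathways are equally fast.
fraction = aminolysis_fraction(k_aminolysis=1000.0,
                               k_hydrolysis=1.0,
                               nucleophile_conc=0.001)
print(f"fraction of acyl-enzyme going to adduct: {fraction:.2f}")
```

Under this assumed model, raising the nucleophile concentration or the ratio of the aminolysis to the hydrolysis rate constant increases the fraction of acyl-enzyme that ends up as ligation product.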

FIG. 2A depicts the protease substrate residues designated using the nomenclature of Schechter and Berger (1967) Biochem. Biophys. Res. Commun., 27, 157-162. FIGS. 2B and 2C depict first and second ligation substrates used in peptide ligation. The various residues R2, R1, R1' and R2' comprise amino acid residues or analogues of such residues for which the serine protease variant or precursor enzyme has specificity, e.g. those amino acid residues comprising P2, P1, P1' and P2'. The carboxy terminus of the first ligation substrate contains Group X which is a 2-hydroxy carboxylic acid esterified to the C-terminal carboxyl group of the substrate used to form the first ligation substrate. The residues R2", etc. comprise amino acid residues or analogues of such residues for which the protease variant or precursor enzyme has specificity, e.g. P2', etc. FIG. 2D depicts the ligation product formed by ligation of the first and second ligation substrates (FIGS. 2B and 2C) by the serine protease variants of the invention.

FIG. 3 depicts the catalytic residues of Bacillus amyloliquefaciens subtilisin including the catalytic triad Ser221, His64 and Asp32. The tetrahedral intermediate transition state is shown as ES⁺.

FIG. 4 depicts the tertiary structure of subtilisin from Bacillus amyloliquefaciens. The α-helix of subtilisin associated with the catalytic residue Ser221 is highlighted.

FIG. 5 is a stereo view of the α-helix of Bacillus amyloliquefaciens subtilisin associated with the catalytic Ser221.

FIGS. 6a-6b depict the amino acid sequence for subtilisin from Bacillus amyloliquefaciens, Bacillus subtilis VarI168 and Bacillus licheniformis using standard single letter designations for naturally occurring amino acids.

FIG. 7 is a stereo view overlay of wild-type (open bar) and P225A (solid bar) Bacillus amyloliquefaciens subtilisin in the region containing the α-helix that includes Ser221 and position 225.

FIG. 8 depicts an active site view showing a peptide substrate in the transition state bound to the active site of wild-type subtilisin (residues P4 through P2'). The hydrolysis substrate is bound in an outstretched β-sheet like conformation with the main chain carbonyl oxygens and amides involved in hydrogen bonds with corresponding groups of the enzyme. Specifically, the residues S1-S4 form the central strand of a three-stranded antiparallel β-sheet (McPhalen et al., (1988) supra.). Some of the more important side chain interactions on either side of the scissile bond are emphasized. Other residues affecting substrate specificity, e.g. Glu156 and Gly166 for P1 specificity, are not shown.

FIG. 9 depicts the general structure and specific 2-hydroxy carboxylic acids that are useful in making first ligation substrates that mimic a peptide substrate. The structural relationship between specific 2-hydroxy and amino acids is also depicted.

FIGS. 10A, 10B and 10C depict progress curves showing the consumption of peptide ester substrate (succinyl-Ala-Ala-Pro-Phe-glycolate-Phe-amide at a starting concentration of 0.35 mM) and the concurrent appearance of hydrolysis and aminolysis products using the dipeptide Ala-Phe-amide (3.6 mM) as the nucleophile (acyl-acceptor) with different subtilisin variants. The same amount of enzyme (10 μM) was used in each case under identical conditions, pH 8.0, (25°±0.2° C). Aliquots were taken at various times and analyzed by RP-HPLC. The subtilisin variant was Pro225Ala in FIG. 10A, Ser221Cys in FIG. 10B, and Ser221Cys/Pro225Ala in FIG. 10C (Aminolysis ( ), Hydrolysis (∘)).

FIGS. 11A-11D are a comparison of subtilisin variants having different P1 specificities with each of four alternative first ligation substrates. Progress curves show the consumption of substrate and its conversion into hydrolysis and aminolysis products in the presence of 1.5 mM of the nucleophile ligation peptide Ala-Phe-amide. The substrates are abbreviated to their tetrapeptide in the one-letter form, i.e., AAPF is s-Ala-Ala-Pro-Phe-glycolate-Phe-amide. In each case, aminolysis predominates over hydrolysis.

FIG. 12 shows reducing SDS-PAGE of time course aliquots of growth hormone ligations using either a first ligation substrate ester of the natural first eight amino acids or a first ligation peptide modified to become more suited for the wild type specificity of subtilisin. The second ligation substrate was des-octa hGH. Lane 1, low molecular weight standard; Lanes 2 and 9, non-ligated des-octa hGH; Lanes 3-7, aliquots from the reaction using the G166E/S221C/P225A version of the peptide ligase with the peptide ester of the natural N-terminus, after times 1, 10, 20, 40 and 80 minutes; Lane 8, wild type growth hormone; Lanes 10-14, aliquots from the reactions using the S221C/P225A form of the ligase with the alternative peptide ester, after times 1, 10, 20, 40 and 80 minutes. In all lanes containing the des-octa mutant hGH, two smaller bands can be seen below the major band. These are the two chains resulting from a proteolytic cut during the expression and purification from E. coli. Furthermore, in all time point aliquots the subtilisin variant may be seen at an apparent weight of about 30 kDa.

FIG. 13 depicts the method of construction of DNA encoding the replacement of proline at position 225 of Bacillus amyloliquefaciens subtilisin with alanine.

FIG. 14 shows reducing SDS-PAGE of time course aliquots of Protropin ligations. A nine residue peptide was ligated onto Protropin using the Ser221Cys/Pro225Ala version of the peptide ligase. Lane 1, low molecular weight standard; Lane 2, subtilisin; Lane 3, unligated Protropin; Lanes 4-9, hGH ligated with a peptide after 1, 2, 5, 10, 30 and 60 minutes.

DETAILED DESCRIPTION

As used herein, "serine protease" refers to a protease which contains at least one catalytically active serine residue. Such serine proteases are ubiquitous, being found in both procaryotic and eucaryotic organisms. A common characteristic of many serine proteases, such as the subtilisin-type and trypsin-type serine proteases, is the presence of a common catalytic triad comprising aspartate, histidine and serine. In the subtilisin-type proteases, the relative order of these amino acids, reading from the amino to carboxy terminus, is aspartate-histidine-serine. In the trypsin-type proteases, the relative order, however, is histidine-aspartate-serine. Notwithstanding this relative sequence orientation of the catalytic residues, the secondary and tertiary structure of serine proteases brings these three catalytic residues into close proximity to form the catalytically active site.

FIGS. 3 and 5 depict the catalytic residues of Bacillus amyloliquefaciens subtilisin. The interaction of serine 221 in subtilisin to form a tetrahedral transition state with the carbon of the scissile peptide bond is shown in FIG. 3. An acyl-enzyme intermediate between the catalytic serine and the carboxy portion of the substrate is formed after the carboxy terminal portion of the substrate leaves the active site. Hydrolysis of the acyl-enzyme intermediate releases the amino terminal portion of the cleaved peptide from the enzyme and restores the serine alcohol group.

In subtilisin-type serine proteases, the OG group of the catalytic side chain of serine (serine-221 in subtilisin) is located near the amino terminus of a long central α-helix which extends through the molecule. Bott, et al. (1988) J. Biol. Chem. 263, 7895-7906. In Bacillus amyloliquefaciens subtilisin this α-helix comprises methionine 222 through lysine 237. This helix is conserved in evolutionarily related subtilisin-type serine proteases but is not found in the catalytic sites of trypsin-type serine proteases. McPhalen, et al. (1988) Biochemistry 27, 6582-6598. The α-helix associated with the active site of subtilisin-type serine proteases has led to the suggestion that the dipole of this helix may have a functional role in catalysis. Hol, W. G. J. (1985) Prog. Biophys. Molec. Biol. 45, 149-195. The lack of an α-helix at the active site of the trypsin-type serine proteases, however, has raised the unresolved question of whether the active site helix of subtilisin-type serine proteases is of any significance in catalysis. Hol (1985) supra.

In FIG. 4, the α-helix of Bacillus amyloliquefaciens subtilisin together with the catalytic serine 221 is shown as it relates to the tertiary structure of the enzyme. A stereo view of this α-helix associated with the catalytic serine 221 is shown in FIG. 5.

The amino acid sequence for subtilisin from Bacillus amyloliquefaciens, Bacillus subtilis VarI168 and Bacillus licheniformis is shown in FIG. 6. The α-helix in Bacillus amyloliquefaciens subtilisin extends from Met222 to Lys237. The corresponding (equivalent) α-helix in Bacillus subtilis is the same and in Bacillus licheniformis subtilisin comprises Met221 to Lys236. The corresponding (equivalent) residue of Proline 225 in Bacillus amyloliquefaciens subtilisin is also Proline 225 in Bacillus subtilis and Proline 224 in Bacillus licheniformis subtilisin. The catalytic serine in Bacillus subtilis is at position 221 and at position 220 in Bacillus licheniformis subtilisin.
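For convenience, the equivalent positions stated in the preceding paragraph can be collected in a small lookup table. The Python sketch below is illustrative only; the dictionary layout, feature labels and function name are hypothetical, and only the positions named in the text are included (no other residues are extrapolated).

```python
# Equivalent residue positions stated in the text for three subtilisins.
# Feature labels and structure are illustrative; no positions beyond
# those named in the description are inferred.
EQUIVALENT_POSITIONS = {
    "catalytic serine": {
        "B. amyloliquefaciens": 221,
        "B. subtilis": 221,
        "B. licheniformis": 220,
    },
    "helix start (Met)": {
        "B. amyloliquefaciens": 222,
        "B. subtilis": 222,
        "B. licheniformis": 221,
    },
    "helix end (Lys)": {
        "B. amyloliquefaciens": 237,
        "B. subtilis": 237,
        "B. licheniformis": 236,
    },
    "helix proline": {
        "B. amyloliquefaciens": 225,
        "B. subtilis": 225,
        "B. licheniformis": 224,
    },
}


def equivalent_position(feature, species):
    """Return the residue number of `feature` in the subtilisin of `species`.

    Raises KeyError if the feature or species is not one named in the text.
    """
    return EQUIVALENT_POSITIONS[feature][species]


for feature in EQUIVALENT_POSITIONS:
    print(feature, equivalent_position(feature, "B. licheniformis"))
```

A table of this kind makes it explicit which residue of another subtilisin corresponds to, for example, Pro225 of the Bacillus amyloliquefaciens enzyme when an equivalent variant is designed.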

As used herein, a "precursor serine protease" refers to naturally occurring serine proteases and recombinant serine proteases. Examples of naturally occurring precursor serine proteases include the bacterial subtilisins from Bacillus amyloliquefaciens, Bacillus licheniformis, Bacillus amylosacchariticus, and homologous serine proteases from fungi, plant and higher animal species, and trypsin-type serine proteases from bacteria, fungi, plant, animal and virus including trypsin, chymotrypsin, α-lytic protease, elastase, plasminogen, thrombin, tissue plasminogen activators and homologs thereof.

"Recombinant serine proteases" refers to serine proteases in which the DNA sequence encoding the naturally occurring serine protease is modified to produce a variant DNA sequence which encodes the substitution, insertion or deletion of one or more amino acids in the serine protease amino acid sequence. Suitable modification methods are disclosed herein and in EPO Publication No. 0 130 756 published Jan. 9, 1985 and EPO Publication No. 0 251 446 published Jan. 7, 1988. When a particular serine protease is referred to, e.g. subtilisin, the terms precursor subtilisin and recombinant subtilisin are used consistent with these definitions.

In addition to naturally occurring and recombinant serine proteases, the term "precursor serine protease" also includes "synthetic serine proteases" which are naturally occurring or recombinant serine proteases which contain one or more amino acid residues which are not naturally occurring. Such synthetic serine proteases may be made by the methods described herein or by in vitro transcription-translation methods as disclosed by Schultz, et al. (1989), Science, 244, 182-188.

As used herein, a "serine protease variant" refers to a serine protease having an amino acid sequence which is derived from the amino acid sequence of a precursor serine protease. The amino acid sequence of the serine hydrolase variant is "derived" from the precursor protease amino acid sequence by the substitution, deletion or insertion of one or more amino acids of the precursor amino acid sequence. Such modification is of the "precursor DNA sequence" which encodes the amino acid sequence of the precursor serine protease. Further, in some instances, the serine protease variant may be derived from the precursor protease by the direct chemical modification of one or more side chains of the amino acid residues of the precursor amino acid sequence.

For example, in one of the preferred embodiments the serine 221 of Bacillus amyloliquefaciens subtilisin is converted to cysteine. Such conversion may be obtained by modifying the DNA sequence of subtilisin to encode cysteine, thereby replacing serine with cysteine in the variant, or by the direct modification of serine with an appropriate chemical agent to form the equivalent variant. See e.g. Neet, K. E., et al (1966), Proc. Natl. Acad. Sci. USA, 56, 1606-1611; Polgar, L., et al. (1966), J. Amer. Chem. Soc., 88, 3153-3154, which describe the chemical modification of subtilisin to form thiolsubtilisin.

In the preferred embodiments, the nucleophilic oxygen of the side chain of an active site serine residue is replaced or modified to substitute the nucleophilic oxygen of that side chain with a different nucleophile. Preferred nucleophiles include --SH, --SeH and --NH₂. The most preferred nucleophile is --SH, representing the replacement of serine with cysteine.

In addition, the serine protease variants of the invention include the replacement or modification of a second amino acid residue in the precursor serine protease. The modification of this second amino acid residue is made to perturb the active site to accommodate the changes induced therein by the substitution or modification of the side chain of the active site serine residue. When the active site serine has been replaced or modified to produce a side chain which is larger than the naturally occurring serine side chain (e.g. when cysteine replaces serine or selenium replaces the nucleophilic oxygen), the second amino acid residue is a residue which upon replacement or modification causes the new nucleophilic catalytic side chain to be displaced in a way which effectively increases the size of the active site. Thus, for example, in the case of the subtilisin-type serine proteases, the nucleophile residing at position 221 in Bacillus amyloliquefaciens subtilisin may be moved by modifying one or more amino acid residues in the α-helix which is located at or near the active site serine in the precursor serine protease. In the case of subtilisin, such modification or replacement is preferably of proline 225 (Caldwell, et al. (1989), J. Cell. Biochem. supp., 13A, 51). Modification of this residue in Bacillus amyloliquefaciens subtilisin, or an equivalent residue in other subtilisins, is by replacement with a helix-forming amino acid. Chou, P. Y., et al. (1974), Biochemistry, 13, 211; Kyte, J., et al. (1982), J. Mol. Biol., 157, 105; Rose, G., et al. (1977), J. Mol. Biol., 113, 153. Examples of such amino acids include alanine, leucine, methionine, glutamine, valine and serine. Helix breaking amino acids such as glycine and proline are not preferred. Alternatively, replacement amino acids comprise those which have a smaller side chain volume than the residue being replaced. Chothia (1984), Ann. Rev. Biochem., 53, 537. Replacement with the aforementioned amino acids or their analogs modifies the discontinuity (kink) preceding residues 223-237 in the α-helix.

A preferred replacement amino acid for position 225 is alanine. When such a variant is made, the α-helix containing Ser221 is known to move the --OH nucleophile away from the oxyanion hole. The effect of this particular replacement on naturally occurring subtilisin is shown in FIG. 7 which is a stereo view of residues 220 through 230 of wild type and Pro225Ala subtilisin. The --OH nucleophile of Ser221 is moved away from the oxyanion hole and the catalytic histidine 64 by 0.2 to 0.4 Å. When combined with the replacement of Ser221 with cysteine, this displacement of the 221 residue effectively offsets the increase in nucleophile size of 0.65 Å and 1.03 Å, respectively. Pauling, L. (1960) in The Nature of the Chemical Bond, 3d ed., Cornell Univ. Press, Ithaca, N.Y., pp. 246-260.

A characteristic of the serine protease variant of the invention is that it has peptide ligase activity in aqueous solution which is greater than that of a different serine protease containing only the substitution or modification of the nucleophilic oxygen of the active site serine residue. Thus, variants such as Ser221Cys/Pro225Ala have greater peptide ligase activity than the Ser221Cys variant.

As used herein, "peptide ligase activity" refers to the ability of an enzyme to ligate two or more substrates. Such substrates include the ligation substrates described hereinafter as well as known activated peptides such as peptide thiobenzyl esters, p-chlorophenyl esters (Nakatsuka, et al. (1987), J. Amer. Chem. Soc., 109, 3808-3810), p-nitrophenyl esters and other aryl esters. In addition, activated esters include alkyl esters such as methyl, ethyl, glycolate, lactate, etc. and alkylthiol esters. Peptide ligase activity is measured by contacting the enzyme of interest with at least two peptides or substrates (one of which generally contains an activated carboxy terminus) under conditions favorable for ligation. The kcat, Km and/or kcat/Km ratio is then determined for that enzyme. kcat refers to the turn-over number of the enzyme, and gives the maximal number of substrate molecules that can be converted to product per unit time per enzyme molecule. The Km is usually inversely proportional to the affinity of substrate(s) for the enzyme. The catalytic efficiency, which is the second order rate constant for the conversion of substrate to product, is given as the kcat/Km ratio. Variant serine proteases having higher kcat/Km ratios for the same peptide ligation have greater peptide ligase activity. However, they can also be considered to have greater peptide ligase activity if they have improved kcat, assuming that the reactions can be run where the enzymes are saturated with substrates. When comparing the peptide ligase activity of two enzymes, each enzyme is preferably contacted with the same peptides under the same conditions.
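The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed assay analysis: the class and function names are hypothetical, and the kinetic values in the usage lines are invented placeholders rather than measured data for any variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticParameters:
    """Kinetic constants measured for one enzyme on one ligation reaction."""
    kcat: float  # turnover number, s^-1
    km: float    # Michaelis constant, M

    @property
    def catalytic_efficiency(self):
        """Second order rate constant kcat/Km in M^-1 s^-1."""
        if self.km <= 0:
            raise ValueError("Km must be positive")
        return self.kcat / self.km


def has_greater_ligase_activity(a, b, saturating=False):
    """Return True if enzyme `a` has greater peptide ligase activity than `b`.

    By default the kcat/Km ratios are compared. If the reaction is run with
    the enzymes saturated with substrates, kcat alone is compared instead.
    Both parameter sets are assumed to come from the same peptides under
    the same conditions.
    """
    if saturating:
        return a.kcat > b.kcat
    return a.catalytic_efficiency > b.catalytic_efficiency


# Placeholder values, for illustration only.
variant_a = KineticParameters(kcat=2.0, km=1e-3)
variant_b = KineticParameters(kcat=4.0, km=1e-2)
print(has_greater_ligase_activity(variant_a, variant_b))
print(has_greater_ligase_activity(variant_a, variant_b, saturating=True))
```

With these placeholder numbers the two criteria disagree, which is why the text specifies the saturation condition under which kcat alone may be used.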

In addition to the above described modifications of the active site serine and a second amino acid residue to form serine protease variants having ligase activity, various other replacements or modifications of amino acid side chains may be made to modify specificity for the ligation peptides used to form a desired ligation peptide product. For example, the subtilisin variant Ser221Cys/Pro225Ala is derived from the wild type subtilisin from Bacillus amyloliquefaciens. This wild type subtilisin preferentially hydrolyzes peptides wherein the P1 residue for the hydrolysis peptide shown in FIG. 2A is phenylalanine, tyrosine, tryptophan, leucine, methionine or lysine but not Thr, Val, Ile, Pro or Gly. In addition, preferred hydrolysis peptides should not contain Ile, Pro, Asp or Glu at the P1' position or small amino acids such as Gly, Ala, Pro, Ser, Asp or Thr at the P2' position. In the subtilisin variant Ser221Cys/Pro225Ala, the amino acid residues forming the substrate binding cleft of subtilisin are not modified and accordingly the various enzyme subsites within this binding cleft which interact with one or more of the substrate residues, e.g. P2 through P2', are still capable of binding the normal substrate for the wild type subtilisin. Of course, this variant demonstrates substantially reduced catalytic activity with such peptide substrate because of the modifications made to form the subtilisin variant.
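The residue preferences listed above for wild type Bacillus amyloliquefaciens subtilisin can be expressed as a simple screening check. The Python sketch below is illustrative only: the function name and message wording are hypothetical, the residue sets are copied from the preceding paragraph, and a residue that the text does not mention is merely flagged as unlisted rather than judged.

```python
from typing import List

# Residue sets taken from the text (three-letter codes).
PREFERRED_P1 = {"Phe", "Tyr", "Trp", "Leu", "Met", "Lys"}
DISFAVORED_P1 = {"Thr", "Val", "Ile", "Pro", "Gly"}
DISFAVORED_P1_PRIME = {"Ile", "Pro", "Asp", "Glu"}
DISFAVORED_P2_PRIME = {"Gly", "Ala", "Pro", "Ser", "Asp", "Thr"}


def check_junction(p1: str, p1_prime: str, p2_prime: str) -> List[str]:
    """Return a list of concerns for the residues flanking the scissile
    (or ligated) bond; an empty list means none of the listed exclusions apply.
    """
    concerns = []
    if p1 in DISFAVORED_P1:
        concerns.append(f"P1 {p1} is listed as not preferred")
    elif p1 not in PREFERRED_P1:
        concerns.append(f"P1 {p1} is not among the residues listed as preferred")
    if p1_prime in DISFAVORED_P1_PRIME:
        concerns.append(f"P1' {p1_prime} should be avoided")
    if p2_prime in DISFAVORED_P2_PRIME:
        concerns.append(f"P2' {p2_prime} should be avoided")
    return concerns


# Junction formed in FIG. 10: ...Pro-Phe ligated to Ala-Phe-amide.
print(check_junction("Phe", "Ala", "Phe"))
# A junction that violates all three listed preferences.
print(check_junction("Gly", "Pro", "Ala"))
```

For a variant whose specificity has been altered (for example at Gly166), the sets above would need to be replaced by the preferences of that variant.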

Such a subtilisin variant, accordingly, is capable of binding ligation substrates containing amino acid residues or analogs corresponding to or closely related to those found in the normal peptide substrate. Thus, the "first ligation substrate" as schematically represented in FIG. 2B contains residues Rn through R1 (reading from the amino to carboxy terminus) wherein R1 is a large hydrophobic amino acid or analog similar to that normally found in the P1 position of a subtilisin hydrolysis substrate. Similarly, R2 corresponds to or is closely related to the amino acid residue (or its analog) normally found in position P2 of the hydrolysis substrate. The group X which is covalently linked to the carboxy terminus of the first ligation substrate and which activates the first ligation peptide and the residues R2" through Rn will be discussed in more detail hereinafter.

The "second ligation substrate" represented schematically in FIG. 2C contains residues R1' through Rn' (reading from the amino to carboxy terminus). When used with the above identified subtilisin variant modified at residues 221 and 225, R1' corresponds to the P1' amino acid residue (or its analog) in the normal subtilisin hydrolysis substrate. With regard to second ligation substrate residue R2', this residue comprises a large hydrophobic amino acid residue (or its analog) which is similar to or corresponds closely to the P2' residue of the normal subtilisin hydrolysis substrate. For both the first and second ligation substrates, other amino acid residues may be chosen to correspond to or be closely related to the amino acid residues (or their analogs) found in the normal hydrolysis substrate. Appropriate R and R' residues for other serine proteases are chosen from, but not limited to, the natural hydrolysis substrates for such proteases.

In addition, either the first or second ligation substrate may include other compounds that result in the site specific modification of the product. The specific compounds may be chosen to specifically target certain sites within the product. For example, the first or second ligation substrate may contain heavy metal ions that will result in heavy metal derivatives of the ligation product, useful in the X-ray crystallographic elucidation of the ligation product structure. The first or second ligation substrate may also contain modified or isotopically labelled amino acids for biophysical studies.

Furthermore, the conditions under which the ligation reactions are carried out may be modified as needed to allow ligation to occur. These modifications may be required to allow the ligation substrates to take the correct structural conformation for ligation to occur, without substantially destroying the ligase activity. For example, the experimental conditions may include the addition of denaturing reagents, detergents, organic solvents or reducing agents, or alterations in pH or temperature, among others.

The first and second ligation substrates, as well as the experimental conditions, may be selected as needed to produce the desired level of specificity. Some applications of the present invention may require lower levels of ligation reaction specificity, while in other applications strict specificities may be desirable.

As indicated herein, the subtilisin variant Ser221Cys/Pro225Ala is capable of ligating a first ligation substrate (containing an activation group) and a second ligation substrate in accordance with the above described preference for R1 and R2' amino acid residues. The peptide ligation activity for this variant is substantially greater than that of the Ser221Cys variant for the same ligation substrates. This variant's preference for first and second ligation substrates having amino acids Phe, Tyr, Met, Leu, Trp and Lys at position R1 and amino acids Phe, Tyr, Leu and Met at position R2' provides substantial utility for ligating block substrates such as those made with known chemical synthetic techniques.
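The residue preferences stated above for the Ser221Cys/Pro225Ala variant can be sketched as a simple junction check. This is only an illustrative sketch: the function name `assess_junction` and the three labels are hypothetical, the residue sets are transcribed from the preceding paragraphs, and treating the wild-type hydrolysis preferences at P1, P1' and P2' as applying to R1, R1' and R2' is an assumption based on the statement that the binding cleft is unmodified.

```python
# Residue sets transcribed from the text (one-letter amino acid codes).
PREFERRED_R1 = set("FYMLWK")    # Phe, Tyr, Met, Leu, Trp, Lys at R1
PREFERRED_R2P = set("FYLM")     # Phe, Tyr, Leu, Met at R2'
AVOID_P1 = set("TVIPG")         # P1 residues not hydrolyzed by wild type
AVOID_P1P = set("IPDE")         # disfavored at P1'
AVOID_P2P = set("GAPSDT")       # small residues disfavored at P2'


def _label(residue: str, preferred: set, avoid: set) -> str:
    """Return 'avoid', 'preferred' or 'neutral' for one residue."""
    if residue in avoid:
        return "avoid"
    if residue in preferred:
        return "preferred"
    return "neutral"


def assess_junction(donor: str, acceptor: str) -> dict:
    """Classify R1 (last donor residue) and R1', R2' (first two acceptor residues).

    Hypothetical helper; assumes the wild-type P-site preferences carry over
    to the corresponding R positions of the ligation substrates.
    """
    r1, r1p, r2p = donor[-1], acceptor[0], acceptor[1]
    return {
        "R1": _label(r1, PREFERRED_R1, AVOID_P1),
        "R1'": _label(r1p, set(), AVOID_P1P),
        "R2'": _label(r2p, PREFERRED_R2P, AVOID_P2P),
    }


if __name__ == "__main__":
    # Donor ending in Phe, acceptor Ala-Phe (both used in the examples herein).
    print(assess_junction("AAPF", "AF"))
```

A check of this kind only restates the qualitative preferences; it does not predict ligation rates.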

Synthetic ligation substrates are generally about 15-25 residues long and produce a ligation product having a length of about 50 residues. Such a first ligation product may thereafter be ligated with another ligation substrate to build a larger ligation product.

In order to provide broader specificity for substrate ligation, other modifications may be made to the serine protease variant to change specificity for the first and/or second ligation substrate. As described herein, subtilisin variants containing the Ser221Cys/Pro225Ala modifications as well as modifications in the Glu156 and Gly166 residues were made to modify the specificity of the variant for the R1 residue of the first ligation substrate. Three variants of the Ser221Cys/Pro225Ala variant were made. These specific variants included the modifications at residues 221 and 225 and further included Gly166Glu, Glu156Gln/Gly166Lys and Gly166Ile. Using the standard one-letter symbols for amino acid residues, these variants can be identified as G166E/S221C/P225A, E156Q/G166K/S221C/P225A and G166I/S221C/P225A, respectively. By introducing these mutations into the S221C/P225A variant, the peptide ligase specificity was substantially altered. See FIG. 11 wherein the ligation of various first ligation substrates (discussed in more detail hereinafter) and the second ligation substrate Ala-Phe-amide is shown.

For a small R1 ester substrate (s-Ala-Ala-Pro-Ala-glc-Phe-amide) the G166I/S221C/P225A variant (ICA) is substantially better than the others (FIG. 11). The E156Q/G166K/S221C/P225A variant (QKCA) efficiently aminolyses a Glu R1 first ligation substrate with a second ligation substrate Ala-Phe-amide (AF-amide). It also has greater peptide ligation activity than the other variants toward a Phe R1 first ligation substrate. For a Lys R1 ester substrate the rates for three of the variants (including the complementary charged G166E/S221C/P225A variant (ECA)) are comparable, and these variants are much more active than the like charged mutant, E156Q/G166K/S221C/P225A. For ligation of peptide substrates containing an R1 Arg ester, our preliminary data indicate that G166E/S221C/P225A is substantially more active than the parent ligase. See Table I.

In general, the aminolysis rates for the optimal enzyme-ligation peptide pair were comparable, indicating it should be possible to efficiently ligate Lys, Ala, Phe, and Glu R1 ligation substrates with the proper choice of S221C/P225A based ligase. Except for the Lys R1 ligation substrate, at least one of the three other specificity variants was significantly better than the parent peptide ligase S221C/P225A (CA). These additional variant enzymes should provide added flexibility in design of ligation junctions.

These results are consistent with the demonstrated specificity of subtilisin variants G166E, E156Q/G166K and G166I for hydrolysis of peptides containing Lys or Arg, Glu and Ala P1 substrates, respectively (Wells, et al. (1987) Proc. Natl. Acad. Sci. USA, 84, 1219-1223; Estell, et al. (1986), Science, 233, 659-663, and EPO Publication 0 251 446). It is expected that other modifications which are known to cause a change and/or shift in substrate specificity for various P and P' residues in wild type subtilisin can be effectively combined with the Ser221/Pro225 or equivalent modifications to further modify the specificity of the serine hydrolase variant for various first and second ligation substrates.

                                  TABLE I
__________________________________________________________________________
Summary of preferred sequences for ligating peptides using variants of
subtilisin.

Residue     P4              P3        P2        P1           R1'        R2'    R3'
__________________________________________________________________________
Avoid:                                          G,P,T,V,I    I,P,D,Eᵃ   P,G
Preferred:  Small or large  flexible  flexible  M,Y,L(1)ᵇ    R,C,N      F,Y    flexible
            hydrophobics                        F(1,4)       T,K,H      L,M
                                                K(1,2,3)     W,Q,Y      R,K
                                                R(3)         A,V,S,G
                                                A(2)
                                                E(4)
__________________________________________________________________________
ᵃ The deleterious effects of the Glu and Asp sidechains can be minimized in
high salt (>1M; Carter, et al., (1989), Proteins, 6, 240-248).
ᵇ These residues are preferred with the following variants of
thiolsubtilisin: (1) = S221C/P225A; (2) = G166I/S221C/P225A; (3) =
G166E/S221C/P225A; (4) = E156Q/G166K/S221C/P225A.
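The variant assignments in footnote b of Table I lend themselves to a small lookup, sketched below. The names `VARIANTS`, `PREFERRED_P1` and `variants_for_p1` are hypothetical, and reading the entry "M,Y,L(1)" as applying variant (1) to all three of Met, Tyr and Leu is an assumption about how the table is meant to be read.

```python
# Variant numbering from footnote b of Table I.
VARIANTS = {
    1: "S221C/P225A",
    2: "G166I/S221C/P225A",
    3: "G166E/S221C/P225A",
    4: "E156Q/G166K/S221C/P225A",
}

# Preferred P1 residues and the variants listed for each in Table I.
# Assumption: "(1)" after "M,Y,L" applies to all three residues.
PREFERRED_P1 = {
    "M": (1,),
    "Y": (1,),
    "L": (1,),
    "F": (1, 4),
    "K": (1, 2, 3),
    "R": (3,),
    "A": (2,),
    "E": (4,),
}


def variants_for_p1(residue: str) -> list:
    """Return the variants that Table I lists for a given P1 residue.

    An empty list means Table I names no preferred variant for that residue.
    """
    return [VARIANTS[i] for i in PREFERRED_P1.get(residue.upper(), ())]


if __name__ == "__main__":
    for res in "FRAG":
        print(res, variants_for_p1(res))
```

For a donor peptide ending in Arg, the lookup returns only G166E/S221C/P225A, which is the variant the specification reports as more active toward R1 Arg esters.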

Many of these modifications to Bacillus amyloliquefaciens subtilisin are disclosed in EPO Publication 0 130 756 published Jan. 9, 1988, and EPO Publication No. 0 251 446, published Jan. 7, 1988. Such modifications may be readily made to the Ser221/Pro225 variants described herein (as well as other variants within the scope of the invention) to form variants having a wide range of specificity for first and second ligation substrates. The methods disclosed in these EPO publications may be readily adapted to modify the DNA encoding the variants described herein.

In addition to combining modifications which affect substrate specificity, it is also possible to combine other modifications which affect other properties of the serine protease. In particular, modifications have been made to subtilisin which affect a variety of properties which may be desirable to combine with the variants of the present invention. For example, substitution of methionine at position 50 with Phe or Cys and at position 222 with Ala, Gly, Ser and Cys results in a subtilisin variant which is oxidatively stable as compared to wild type subtilisin. In addition, there are known modifications to subtilisin which result in increased thermal stability, alkaline stability and changes in pH activity profile. See e.g. EPO Publication 0 251 446 published Jan. 7, 1988. The invention contemplates combining these and other possible modifications to form variants which in addition to having ligation activity are also characterized by changes in one or more other properties of the precursor enzyme. For example, serine protease variants which resist inactivation at higher temperatures than that of the variant not containing such modifications are useful to ligate first and second ligation substrates wherein an increase in reaction temperature facilitates a partial or complete denaturation of one or more of the ligation peptides to increase the ligation yield. Similarly, variants which have optimal activity at a pH which also facilitates denaturation of ligation substrate may prove useful in specific applications. Of course, various other modifications not presently known but which are found to confer desirable properties upon the serine protease variants of the invention are contemplated to be within the scope of the invention.

Although activated aryl-ester substrates (such as thiobenzyl esters) are more efficient than corresponding alkyl-esters to acylate subtilisin, aryl-esters are more difficult to synthesize and inherently less stable. A series of alkyl-ester substrates was prepared to improve upon their catalytic efficiencies as donor substrates for acyl-enzyme intermediate formation. Peptide substrates bind to subtilisin in an extended anti-parallel β-sheet conformation from residues P4 to P3' (McPhalen and James (1988), Biochemistry, 27, 6592-6598). Although the P4 and P1 residues dominate the substrate specificity of the enzyme (for review see Philipp and Bender (1983), Mol. Cell. Biochem., 51, 5-32; Estell et al. (1986), Science, 233, 659-663), the catalytic efficiency for hydrolysis is enhanced significantly when peptide substrates are extended from P1' to P3' (Morahara et al. (1970), Arch. Biochem. Biophys., 138, 515-525).

Referring to FIGS. 2B and 2C, first and second ligation substrates are shown with R1 and R2' groups being as previously described. The leaving group X in the ester in FIG. 2B may be any of the following organic alcohols or thiols: C₆-C₁₂ aryl where the aryl group is unsubstituted or substituted by one or more of the groups nitro, hydroxy, halo (F, Cl, Br, I), C₁-C₈ alkyl, halo-C₁-C₈ alkyl, C₁-C₈ alkoxy, amino, phenyloxy, phenyl, acetamido, benzamido, di-C₁-C₈ alkylamino, C₁-C₈ alkylamino, C₆-C₁₂ aroyl, C₁-C₈ alkanoyl, and hydroxy-C₁-C₈ alkyl, C₁-C₁₂ alkyl either substituted or unsubstituted, branched, straight chain or cyclo where the substituents are selected from halo (F, Cl, Br, I), C₁-C₈ alkoxy, C₆-C₁₂ aryloxy where the aryl group is unsubstituted or substituted by one or more of the groups nitro, hydroxy, halo (F, Cl, Br, I), C₁-C₈ alkyl, C₁-C₈ alkoxy, amino, phenyloxy, acetamido, benzamido, di-C₁-C₈ alkylamino, C₁-C₈ alkylamino, C₆-C₁₂ aroyl, and C₁-C₈ alkanoyl, isothioureido, C₃-C₇ cycloalkyl, ureido, amino, C₁-C₈ alkylamino, di-C₁-C₈ alkylamino, hydroxy, amino-C₂-C₈ alkylthio, amino-C₂-C₈ alkoxy, acetamido, benzamido wherein the phenyl ring is unsubstituted or substituted by one or more of the groups nitro, hydroxy, halo (F, Cl, Br, I), C₁-C₈ alkyl, C₁-C₈ alkoxy, amino, phenyloxy, acetamido, benzamido, di-C₁-C₈ alkylamino, C₁-C₈ alkylamino, C₆-C₁₂ aroyl, C₁-C₈ alkanoyl, C₆-C₁₂ arylamino wherein the aryl group is unsubstituted or substituted by one or more of the groups nitro, hydroxy, halo, C₁-C₈ alkyl, C₁-C₈ alkoxy, amino, phenyloxy, acetamido, benzamido, di-C₁-C₈ alkylamino, C₁-C₈ alkylamino, C₆-C₁₂ aroyl, and C₁-C₈ alkanoyl, guanidino, phthalimido, mercapto, C₁-C₈ alkylthio, C₆-C₁₂ arylthio, carboxy, carboxamide, carbo-C₁-C₈ alkoxy, C₆-C₁₂ aryl wherein the aryl group is unsubstituted or substituted by one or more of the groups nitro, hydroxy, halo, C₁-C₈ alkyl, C₁-C₈ alkoxy, amino, phenyloxy, acetamido, benzamido, di-C₁-C₈ alkylamino, C₁-C₈ alkylamino, hydroxy-C₁-C₈ alkyl, C₆-C₁₂ aroyl, and C₁-C₈ alkanoyl, and aromatic heterocycle wherein the heterocyclic groups have 5-10 ring atoms and contain up to two O, N, or S heteroatoms.

In one aspect of the invention, X is preferably a 2-hydroxy carboxylic acid. The general formula for a 2-hydroxy carboxylic acid is shown in FIG. 9. As can be seen therein, the core structure of the 2-hydroxy carboxylic acid is similar to the core structure for an amino acid except for the replacement of the amino group by the 2-hydroxy group. Accordingly, an appropriate side chain R group may be chosen for the 2-hydroxy acid to correspond to the side chain R groups found on naturally occurring amino acids. Thus, the various 2-hydroxy carboxylic acids, e.g. glycolate corresponding to the amino acid glycine and lactate corresponding to the amino acid alanine, etc., may be esterified with the carboxy terminus of a substrate to form a first ligation substrate.

First ligation substrates were constructed wherein the 2-hydroxy carboxylic acid was glycolate or lactate. In essence, the 2-hydroxy carboxylic acid acts as an amino acid residue which may bind to that portion of the enzyme binding cleft which interacts with the P1' residue of a hydrolysis substrate. Further, the free carboxyl group of the 2-hydroxy carboxylic acid may be amidated to form an amide or amidated with an amino acid residue (or analog) or peptide (or peptide analog) represented by R2" or R2" through Rn", respectively, as shown in FIG. 2B. As indicated therein, the R2" is preferably chosen to optimize the interaction with that portion of the binding cleft of the enzyme which normally interacts with the hydrolysis substrate residue P2'. Similar analogies exist for other R" residues. The leaving group thus obtained activates the first ligation substrate such that the activation energy for ester cleavage is lowered by optimizing binding to the serine protease variant.

As used herein, a "ligation product" is formed by ligation of first and second ligation substrates by the serine hydrolase variants of the invention. It is to be understood, however, that ligation products and the first and second ligation peptides need not be made entirely of naturally occurring amino acids and further need not be entirely proteinaceous. In this regard the only requirement of first and second ligation substrates is that they contain at least an R1 amino acid or functional analog thereof (at the carboxy terminus of the first ligation peptide) and an amino acid or functional analog at the R1' position of the second ligation peptide. Such R1 and R1' residues are capable of binding in the appropriate portion of the substrate binding cleft of the serine hydrolase variant such that ligation occurs. Specifically included in R1 and R1' are those amino acid residue analogs which are functional with the serine protease variants of the invention. Such analogs include L-selenocysteine, L-selenomethionine and L-propargylglycine (Sigma).

To the extent that such binding and ligation also requires R2 and R2' amino acids (or their analogs) or additional amino acids R3 or R3', etc. (or their analogs), the first and second ligation substrates will contain such structure. However, first and/or second ligation substrates may contain non-naturally occurring amino acids at positions outside of the region required for binding and ligation. Further, since it is only necessary for the first and second ligation substrates to have sufficient binding to bring about ligation, the first and second ligation substrates and the corresponding ligation product formed therefrom may contain virtually any chemical structure outside of the necessary binding and ligation region. Accordingly, the invention is not limited to ligation substrates and ligation products which correspond to a polypeptide or protein containing naturally occurring amino acids.

The efficiency of peptide ligation using a series of glycolate and lactate-esters with S221C/P225A subtilisin was analyzed (Table II). As indicated, there is a systematic increase in kcat/KM of about 10-fold in extending esters from -glc-amide through -glc-Phe-Gly-amide. Most of this increase is the result of lower KM values. There is a similar progression starting from -lac-amide that further illustrates the advantage of extending the ester chain length. The lactate-ester series is generally 4- to 5-fold less reactive than the glycolate-ester series and contains an additional chiral center. Therefore, because of its improved catalytic efficiency and ease of synthesis, the -glc-Phe-amide ester substrate was further studied.

                            TABLE II
______________________________________________________________
Substrate leaving group comparison. The acyl-donor part is in all cases
s-AlaAlaProPhe and is followed by the alternative leaving groups. The
initial rate of ligation using the dipeptide AlaPhe-amide (3.1 mM) was
measured. ᵃ The values of kcat/KM, the aminolysis fraction of the
apparent second order rate constant for the reaction between the
substrate and the enzyme, are the most reliable (error of about 20%).

leaving groupᵃ        kcat (s⁻¹)    KM (mM)    ##STR1##
______________________________________________________________
-glc-amide                9.5         3.0      3.20 × 10³
-glc-Phe-amide           20.0         1.3        15 × 10³
-glc-Phe-Gly-amide       20.0         0.6        34 × 10³
-lac-amide                3.8         4.5      0.86 × 10³
-lac-Leu-amide            2.3         1.9       1.2 × 10³
-lac-Phe-amide            2.9         1.0       2.9 × 10³
-Sbz                    230.0         0.4       650 × 10³
______________________________________________________________
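As a worked check on Table II, the sketch below recomputes kcat/KM from the tabulated kcat and KM columns and compares the result with the tabulated final column. It assumes that the final column is kcat/KM expressed in M⁻¹ s⁻¹ (the column heading itself does not survive in the table, so this unit is an assumption), and the names `TABLE_II` and `specificity_constant` are hypothetical. Because the caption states that the kcat/KM values carry an error of about 20%, the recomputed and tabulated values are expected to agree only approximately.

```python
# (leaving group, kcat in 1/s, KM in mM, tabulated final column) from Table II.
# Assumption: the final column is kcat/KM in 1/(M*s).
TABLE_II = [
    ("-glc-amide", 9.5, 3.0, 3.20e3),
    ("-glc-Phe-amide", 20.0, 1.3, 15e3),
    ("-glc-Phe-Gly-amide", 20.0, 0.6, 34e3),
    ("-lac-amide", 3.8, 4.5, 0.86e3),
    ("-lac-Leu-amide", 2.3, 1.9, 1.2e3),
    ("-lac-Phe-amide", 2.9, 1.0, 2.9e3),
    ("-Sbz", 230.0, 0.4, 650e3),
]


def specificity_constant(kcat_per_s: float, km_mM: float) -> float:
    """Return kcat/KM in 1/(M*s), converting KM from mM to M."""
    return kcat_per_s / (km_mM * 1e-3)


if __name__ == "__main__":
    for name, kcat, km, tabulated in TABLE_II:
        recomputed = specificity_constant(kcat, km)
        print(f"{name:20s} recomputed {recomputed:10.3g}  tabulated {tabulated:10.3g}")

    # Fold increase along the glycolate series, using the tabulated values.
    fold = TABLE_II[2][3] / TABLE_II[0][3]
    print(f"-glc-amide to -glc-Phe-Gly-amide: {fold:.1f}-fold")
```

With the tabulated values, the increase from -glc-amide to -glc-Phe-Gly-amide is a little over 10-fold, in line with the text.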

The P225A variant rapidly and quantitatively hydrolyzes the -glc-Phe-amide-ester with very little aminolysis (FIG. 10A). The S221C variant aminolyzes the substrate slowly and hydrolyzes the adduct (ligation product) so that during this time almost one-third of the substrate becomes hydrolyzed (FIG. 10B). However, the S221C/P225A variant gives rapid and almost quantitative aminolysis (>90%; FIG. 10C). Moreover, as expected from the very low amidase activity of S221C/P225A (Table III), the aminolysis product (ligation product) was not detectably hydrolyzed, unlike the result using the S221C or P225A variants (data not shown). On the basis of rapid aminolysis and slow ligation product hydrolysis, S221C/P225A is a more useful enzyme than either of its parent single mutants for ligation of substrates using the -glc-Phe-amide-ester donor substrate.

Table III: Kinetic constants for the hydrolysis of an amide substrate (s-Ala-Ala-Pro-Phe-pNA) and for an activated ester substrate (s-Ala-Ala-Pro-Phe-Sbz). The ratio of esterase to amidase activities is the ratio of the apparent second order rate constants (kcat/KM). The aminolysis to hydrolysis ratio was investigated using the thiobenzyl ester substrate with 3.6 mM of the dipeptide Ala-Phe-amide as the nucleophile.

                                 TABLE III
__________________________________________________________________________
                 s-Ala-Ala-Pro-Phe-pNA
enzyme           kcat                  KM                    kcat/KM
__________________________________________________________________________
wild typeᵃ       (4.4 ± 0.01) × 10¹    (1.8 ± 0.1) × 10⁻⁴    (2.5 ± 0.1) × 10⁵
Pro225Ala        4.1 ± 0.04            (7.8 ± 0.2) × 10⁻⁴    (5.2 ± 0.1) × 10³
Ser221Cys        (1.3 ± 0.04) × 10⁻³   (4.9 ± 0.4) × 10⁻⁴    2.7 ± 0.1
221Cys/225Ala    <3 × 10⁻⁵             ND                    ND
__________________________________________________________________________
                 s-Ala-Ala-Pro-Phe-Sbz
enzyme           kcat                  KM                    kcat/KM
__________________________________________________________________________
wild typeᵃ       (2.3 ± 0.1) × 10³     (1.9 ± 0.1) × 10⁻⁴    (1.2 ± 0.1) × 10⁷
Pro225Ala        (2.3 ± 0.03) × 10³    (3.8 ± 0.1) × 10⁻⁴    (6.2 ± 0.1) × 10⁶
Ser221Cys        1.4 ± 0.1             (5.5 ± 0.1) × 10⁻⁵    (2.5 ± 0.3) × 10⁴
221Cys/225Ala    (4.1 ± 0.2) × 10¹     (1.9 ± 0.2) × 10⁻⁴    (2.1 ± 0.2) × 10⁵
__________________________________________________________________________
enzyme           esterase/amidase      aminolysis/hydrolysis
__________________________________________________________________________
wild typeᵃ       4.8                   5.6 × 10⁻³
Pro225Ala        1.2                   1.6 × 10⁻³
Ser221Cys        9.3                   3.0 × 10²
221Cys/225Ala    ND                    2.5 × 10¹
__________________________________________________________________________
ᵃ Data for the pNA substrate and for the Sbz substrate from Carter and
Wells (1988), Nature, 332, 564-568; and Wells, et al. (1986), Phil. Trans.
R. Soc. Lond., A317, 415-423, respectively.
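The esterase to amidase ratio defined in the caption of Table III can be recomputed from the two kcat/KM columns, as in the sketch below. The central values are transcribed from Table III without their error terms, and the names `KCAT_KM` and `esterase_to_amidase` are hypothetical. The leading digits of the recomputed ratios (4.8, 1.2 and 9.3) match the esterase/amidase entries of the table; the table entries as they appear show no power of ten, so the magnitudes printed here are derived values, not values read from the table.

```python
# enzyme -> (kcat/KM for the pNA amide substrate, kcat/KM for the Sbz ester
# substrate), central values from Table III. The double mutant is omitted
# because its amidase kcat/KM is listed as ND.
KCAT_KM = {
    "wild type": (2.5e5, 1.2e7),
    "Pro225Ala": (5.2e3, 6.2e6),
    "Ser221Cys": (2.7, 2.5e4),
}


def esterase_to_amidase(amidase_kcat_km: float, esterase_kcat_km: float) -> float:
    """Ratio of the apparent second order rate constants (ester over amide)."""
    return esterase_kcat_km / amidase_kcat_km


if __name__ == "__main__":
    for enzyme, (amidase, esterase) in KCAT_KM.items():
        ratio = esterase_to_amidase(amidase, esterase)
        print(f"{enzyme:10s} esterase/amidase = {ratio:.2g}")
```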

Sequence requirements for the nucleophilic acceptor peptide (second ligation substrate) were investigated by determining the ligation efficiency of a series of acceptor dipeptides having the form NH₂-R1'-Phe-amide, where R1' corresponds to the amino-terminal residue of the second ligation (acceptor) substrate that resides in the R1' binding site during attack of the thioacyl-enzyme intermediate. As the R1' residue is varied in size or charge (Gly, Ala, Leu, Arg) the apparent second order rate constant for aminolysis of s-Ala-Ala-Pro-Phe-glc-Phe-amide varies less than 7-fold (Table IV). This is consistent with the relatively broad specificity for hydrolyzing various P1' peptide substrates (Carter et al. (1989), Proteins, 6, 240-248).

                            TABLE IV
______________________________________________________________
Comparison of efficiency in the aminolysis reaction of different di- and
tri-amino acid peptides. The apparent second order rate constants for the
reaction of the nucleophile with the peptide ligase acylated by
s-Ala-Ala-Pro-Phe are compared.

peptideᵃ                    peptide
______________________________________________________________
GF            0.8           GA            0.1
AF            1.0           GL            0.3
LF            0.3           FG            0.006
RF            2.0           AFA           2.0
RG            0.04          LFD           0.3
______________________________________________________________
ᵃ The one-letter codes are used. All peptides were amidated at the carboxy
terminus.

The R2' site was probed with a series of dipeptides having the form NH₂-Gly-R2'-amide (Table IV). Although there is a preference for larger hydrophobic amino acids, the rate constant for ligation varies only 8-fold in going from Ala to Leu to Phe for S221C/P225A. For some dipeptide combinations the difference can be much larger. For example, Arg-Phe is nearly 100-fold faster to aminolyze than Arg-Gly. Moreover, bad combinations, such as a large hydrophobic amino acid at R1' and Gly at R2', can make for extremely poor ligation substrates (compare NH₂-Phe-Gly-amide with NH₂-Gly-Phe-amide; Table IV). Extending the nucleophilic peptide can enhance the catalytic efficiency of ligation 2- to 3-fold (compare NH₂-Ala-Phe-amide with NH₂-Ala-Phe-Ala-amide; Table IV).
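The pairwise comparisons drawn from Table IV can be reproduced with the short sketch below. The names `TABLE_IV` and `fold` are hypothetical, and the numbers are the Table IV entries taken at face value as comparable rate constants. With the values as tabulated, the Gly-Ala to Gly-Phe comparison gives 8-fold and the Ala-Phe to Ala-Phe-Ala comparison gives 2-fold, while the Arg-Phe to Arg-Gly ratio computes to 50-fold from the tabulated entries.

```python
# Values as tabulated in Table IV (one-letter codes, C-terminal amides).
TABLE_IV = {
    "GF": 0.8, "AF": 1.0, "LF": 0.3, "RF": 2.0, "RG": 0.04,
    "GA": 0.1, "GL": 0.3, "FG": 0.006, "AFA": 2.0, "LFD": 0.3,
}


def fold(faster: str, slower: str) -> float:
    """Ratio of the tabulated rate constants for two acceptor peptides."""
    return TABLE_IV[faster] / TABLE_IV[slower]


if __name__ == "__main__":
    # R1' series (X-Phe-amide): spread between the largest and smallest value.
    r1_series = [TABLE_IV[p] for p in ("GF", "AF", "LF", "RF")]
    print(f"R1' series spread: {max(r1_series) / min(r1_series):.1f}-fold")

    # R2' series (Gly-X-amide) and the other comparisons named in the text.
    for a, b in (("GF", "GA"), ("RF", "RG"), ("GF", "FG"), ("AFA", "AF")):
        print(f"{a} vs {b}: {fold(a, b):.3g}-fold")
```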

As a test for ligation of two large ligation substrates, a peptide ester was synthesized containing the first eight amino acids of hGH (FPTIPLSR) esterified to glycolate-Phe-amide (first ligation peptide). The acceptor peptide fragment was des-octa hGH, which contained residues 9-191 of hGH (second ligation peptide). The G166E/S221C/P225A variant produced ligation product after 80 min having the expected molecular weight (FIG. 12). Amino-terminal sequencing of the first 10 residues of the product showed that a single FPTIPLSR fragment was ligated properly to the N-terminus of des-octa hGH beginning with sequence Leu-Phe-Asp to produce the full-length hormone. The parent peptide ligase (S221C/P225A) was significantly less efficient at ligating the two hGH peptide fragments. The improved efficiency of the G166E/S221C/P225A enzyme over the parent ligase is attributed to its increased activity for Arg R1 substrates due to the G166E substitution in the P1 binding site. Polymerization of the unprotected peptide ester was not observed. This is most likely the result of the Pro residue at the second amino acid residue of the first ligation peptide, which is a very poor P2' residue (Carter and Wells, (1989), supra).

To demonstrate the utility of the ligase (S221C/P225A), a nonamer ester first ligation substrate (FPTIPAAPF) was constructed that mimicked the optimal enzyme substrate (s-Ala-Ala-Pro-Phe-glc-Phe-amide). This peptide ester was ligated onto des-octa hGH (second ligation substrate) by S221C/P225A (FIG. 12). Protein sequencing of the ligation product gave only the expected amino-terminal sequence, not the unreacted amino-terminus of des-octa hGH. This suggests that ligation occurred only at the α-amino-group of hGH and not at ε-amino-groups of lysine.

This semi-synthesis of hGH is the first example of ligation of peptide fragments in aqueous solution to form a large polypeptide. Previous ligation with thiolsubtilisin in solutions containing greater than 50% DMF (dimethylformamide) reportedly produced a peptide having a length of no longer than 17 amino acid residues (Nakatsuka, et al. (1987), J. Am. Chem. Soc., 109, 3808-3810). In the ligation described herein, first and second ligation peptides were ligated with a serine protease variant of the invention to form a ligation product having 191 amino acid residues. Further, such ligation was accomplished in aqueous solution containing less than 2% by volume of nonaqueous solvent and did not require that peptide side-chains contain protecting groups.

As used herein, the term "aqueous solution" refers to any solution that contains water. In some instances, an aqueous solution may comprise as little as 1% to approximately 5% water. However, an aqueous solution typically comprises greater than about 50% to 100% aqueous solution (excluding solutes). Accordingly, solutions containing a small percentage of nonaqueous solvent are considered to be within the scope of the definition of aqueous solution.

It is to be understood, however, that the variants of the invention are defined only with regard to the peptide ligase activity of the serine protease variant in aqueous solution as compared to a variant modified only at the catalytic serine. This definition does not preclude use of the serine protease variants of the invention in solutions containing little or no water.

The following is presented by way of example and is not to be construed as a limitation of the scope of the claims.

Materials and Methods

Abbreviations: DMA, dimethylacetamide; DMSO, dimethylsulfoxide; DTNB, 5,5'-dithiobis(2-nitrobenzoic acid); DTT, DL-dithiothreitol; hGH, human growth hormone; NEM, N-ethyl maleimide; PAGE, polyacrylamide gel electrophoresis; SDS, sodium dodecyl sulfate; s-Ala-Ala-Pro-Phe-pNA, N-succinyl-L-Ala-L-Ala-L-Pro-L-Phe-para-nitroanilide; s-Ala-Ala-Pro-Phe-Sbz, the thiobenzyl ester of the same succinylated peptide; TFA, trifluoroacetic acid; Tricine, N-tris(hydroxymethyl)methylglycine; Tris, tris(hydroxymethyl)aminomethane; E-Ac, acyl- or thioacyl-enzyme intermediate. Mutant proteins are designated by the wild-type residue (single-letter amino acid code) followed by its position and the mutant residue. Multiple mutants are separated by a slash. For example, S221C/P225A indicates that the serine at position 221 and the proline at position 225 have been replaced by cysteine and alanine, respectively. Protease substrate residues are designated using the nomenclature of Schechter and Berger (1967), ##STR2## where the scissile peptide bond is between the P1 and P1' residues.
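
The mutant designation convention described above is regular enough to be parsed mechanically. The following is a minimal Python sketch offered only as an illustration; the function name `parse_variant` and the tuple layout are assumptions of this example and do not appear in the original disclosure.

```python
import re

# One substitution: wild-type residue, position, mutant residue
# (single-letter amino acid codes), e.g. "S221C".
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(name):
    """Split a designation such as 'S221C/P225A' into a list of
    (wild_type, position, mutant) tuples.

    Hypothetical helper for illustration only.
    """
    substitutions = []
    for part in name.split("/"):
        match = _SUBSTITUTION.match(part.strip())
        if match is None:
            raise ValueError(f"not a valid substitution: {part!r}")
        wild_type, position, mutant = match.groups()
        substitutions.append((wild_type, int(position), mutant))
    return substitutions


if __name__ == "__main__":
    for designation in ("S221C/P225A", "G166E/S221C/P225A"):
        print(designation, parse_variant(designation))
```

Under this sketch, each slash-separated element becomes one tuple, so a triple mutant such as G166E/S221C/P225A yields three entries in the order written.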

Materials: Enzymes for DNA manipulations were from New England Biolabs or Bethesda Research Labs. Oligonucleotides were synthesized by the Organic Chemistry Department at Genentech. All peptides contain L-amino acids unless otherwise indicated, and were synthesized by standard methods (Barany and Merrifield, 1979). DL-Dithiothreitol (DTT), DTNB, 2-mercaptoethanol, NEM, TFA, Tween 80, Tricine, dimethylsulfoxide, dimethylacetamide, and the substrates s-Ala-Ala-Pro-Phe-pNA and s-Ala-Ala-Pro-Phe-Sbz were from Sigma. The solvents ethanol and acetonitrile were from J. T. Baker Inc. and ammonium sulfate from ICN Biochemicals Inc. Dipeptides Ala-Phe-amide, Arg-Gly-amide, Arg-Phe-amide, Gly-Ala-amide, Gly-Leu-amide, Gly-Phe-amide, Leu-Phe-amide, Phe-Gly-amide and the tripeptide Ala-Phe-Ala-amide were obtained from BACHEM Feinchemikalien AG. The tripeptide Leu-Phe-Asp-amide was synthesized according to the general methods described in G. Barany and R. B. Merrifield (1979), in "Solid-Phase Peptide Synthesis", in The Peptides, Analysis, Synthesis, Biology. Special Methods in Peptide Synthesis, Part A, Vol. 2 (E. Gross, J. Meienhofer (eds.), N.Y., Academic Press), pp. 3-254. Activated thiol Sepharose as well as G-25 and G-75 Sepharose were obtained from Pharmacia LKB Technology AB.

Expression and Purification of Variant Subtilisins: The subtilisin gene in the M13-E. coli-B. subtilis shuttle plasmid, pSS5 (Carter and Wells (1987), Science, 237, 398-399), was expressed in the B. subtilis host strain (BG2036) that is lacking its endogenous subtilisin and neutral protease genes (Yang, M. Y., et al. (1984), J. Bacteriol., 160, 15-21). Since maturation of subtilisin involves proteolytic removal of the prosequence (Power et al. (1986), Proc. Natl. Acad. Sci. USA, 83, 3096-3100), the variants with reduced protease activity were expressed in the presence of active subtilisin. This was done either by adding a small amount of purified subtilisin (to a final concentration of 500 μg/L) late in the logarithmic growth phase, or by co-culturing from an inoculum of BG2036 containing 0.1% wild-type subtilisin-expressing cells (Carter and Wells, (1987), supra).

The purification of inactive subtilisin variants was essentially as described (Carter and Wells (1987), supra) except that an equal amount of cold ethanol was added to the supernatant to precipitate impurities prior to the precipitation of subtilisin by the addition of two additional volumes of cold ethanol. Furthermore, the CM-Trisacryl was substituted by SP-Trisacryl M in the ion-exchange chromatography step. For the S221C mutants, the active site cysteine was utilized for purification on activated thiol Sepharose. This latter step is essential in order to separate the variant proteins from any traces of wild-type "helper" subtilisin. The equivalent step in the original procedure used a cysteine residue introduced on the surface of the protein that results in efficient removal of wild-type activity (Carter and Wells (1987), supra; Carter and Wells (1988), Nature, 332, 564-568). The mutant P225A is capable of autoproteolytic processing, and therefore was cultured without helper subtilisin and was purified by standard procedures (Estell et al. (1985), J. Biol. Chem., 260, 6518-6521).

Kinetic Assays. The esterase and amidase activities were obtained from initial rate measurements using a Kontron Uvikon 860 spectrophotometer. The assay for esterase activity utilized the substrate s-Ala-Ala-Pro-Phe-Sbz at 25±0.2° C. in 100 mM Tris-HCl (pH 8.60), 4% (v/v) dimethylsulfoxide, 0.005% (v/v) Tween 80. With the non-cysteine-containing proteases, DTNB (Ellman (1959), Arch. Biochem. Biophys., 82, 70-77) was added to a concentration of 37.5 μM to visualize the release of thiolbenzoate upon hydrolysis of the substrate. With the S221C derivative proteases the difference in absorbance at 250 nm between the substrate and the hydrolyzed product was used to monitor the reaction directly. The amidase activities were measured under identical conditions by following the increase in absorbance at 412 nm upon hydrolysis of p-nitroanilide from s-Ala-Ala-Pro-Phe-pNA.

Enzymatic ligation of peptides was performed at 25±0.2° C. in 90 mM Tricine (pH 8.0), 2% (v/v) dimethylacetamide, 0.005% (v/v) Tween 80. The comparison between substrates having different leaving groups was simplified by measuring initial reaction rates at a low substrate concentration (70-75 μM) and at a higher concentration (1.33 mM), where rates are proportional to kcat/KM and kcat, respectively (Fersht (1977) in Enzyme Structure and Mechanism, W. H. Freeman & Co. USA). The higher substrate concentration may still be below the KM for some of the substrates, so that values for kcat are less accurate. The aminolysis with di- and tripeptides was performed at low peptide concentrations. The aminolysis rate is V = k_aminolysis [N][E-Ac]/KN, where [N] is the nucleophile concentration, [E-Ac] is the concentration of acyl-enzyme intermediate and KN is the dissociation constant for the binding of the nucleophile to the acyl-enzyme intermediate (Riechmann and Kasche (1985), Biochim. Biophys. Acta, 830, 164-172). Since KN = [N][E-Ac]/[N.E-Ac], a change in [N] will result in a change in [E-Ac], but at low [N], [E-Ac] >> [N.E-Ac], and k_aminolysis/KN (the apparent second order rate constant for the reaction between each nucleophile and the acyl-enzyme intermediate) may be compared. The concentrations of the peptide nucleophiles, and the calibration of the absorbance data for the different ligation products, were obtained from amino acid composition analysis. The rates of peptide ligation were measured at four or five different concentrations for each nucleophile.
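
One way to obtain the apparent second order rate constant k_aminolysis/KN from rates measured at several low nucleophile concentrations is a least-squares fit of V/[E-Ac] against [N] constrained through the origin, since the rate expression above is linear in [N] in that regime. The following Python sketch illustrates the arithmetic only. The function name, the assumption that [E-Ac] can be treated as a known constant, and all numerical values are hypothetical and are not taken from the experiments described herein.

```python
import math


def apparent_second_order_constant(nucleophile_conc, rates, acyl_enzyme_conc):
    """Estimate k_aminolysis/KN as the slope of V/[E-Ac] versus [N].

    Uses a least-squares line forced through the origin, which follows from
    V = (k_aminolysis/KN) * [N] * [E-Ac] at low [N].
    Assumption of this sketch: [E-Ac] is a known, constant concentration.
    """
    if len(nucleophile_conc) != len(rates) or not nucleophile_conc:
        raise ValueError("need equally sized, non-empty lists")
    normalized = [v / acyl_enzyme_conc for v in rates]
    numerator = sum(n * y for n, y in zip(nucleophile_conc, normalized))
    denominator = sum(n * n for n in nucleophile_conc)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data: four nucleophile concentrations (M), an assumed
    # acyl-enzyme concentration (M), and rates (M/s) generated from an
    # assumed constant of 2000 M^-1 s^-1.
    assumed_constant = 2000.0
    acyl_enzyme = 1.0e-6
    concentrations = [0.5e-3, 1.0e-3, 2.0e-3, 4.0e-3]
    rates = [assumed_constant * n * acyl_enzyme for n in concentrations]

    fitted = apparent_second_order_constant(concentrations, rates, acyl_enzyme)
    print(f"fitted k_aminolysis/KN = {fitted:.1f} M^-1 s^-1")
    assert math.isclose(fitted, assumed_constant, rel_tol=1e-9)
```

Constants fitted in this way for different nucleophiles can then be compared directly, provided each data set stays in the low-[N] regime where [E-Ac] >> [N.E-Ac].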

Ligation reactions were analyzed by taking aliquots at different times and analyzing the peptide products by C-18 reversed phase HPLC. Peptides were eluted in a gradient of acetonitrile/0.1% TFA in water/0.1% TFA and the absorbance at 214 nm was monitored. Amino acid composition analysis was used to confirm both the hydrolysis and aminolysis products and to calibrate the absorbance values. The structures of the hydrolysis and aminolysis products (using the dipeptide Ala-Phe-amide as a nucleophile) were confirmed by mass spectrometric analysis.

EXAMPLE 1 Production of Subtilisin Variants

Molecular modeling was performed on an Evans and Sutherland PS300 using the program FRODO (Jones (1978), J. App. Crystallogr., 11, 268-272) and coordinates from a 1.8 Å resolution structure of subtilisin BPN' from Bacillus amyloliquefaciens (Bott et al. (1988), J. Biol. Chem., 263, 7895-7906). The S221C mutation was introduced (Carter et al. (1986), Nucl. Acids Res., 13, 4431-4443) into the wild-type subtilisin gene (Wells et al. (1983), Nucl. Acids Res., 11, 7911-7925) using the oligonucleotide 5'-ACAACGGTACCTGGCATGGCATCTCC (asterisks indicate the positions of altered nucleotides and underlined is a unique KpnI site). The S221C/P225A mutations were introduced into the S221C template using the oligonucleotide 5'-GCGTACAACGGTACTTGCATGGCA TCTGCGCACGTTGCC (asterisks indicate altered nucleotides and underlined is a new FspI site) by restriction-selection against the KpnI site (Wells et al. (1986), Phil. Trans. R. Soc. Lond., A317, 415-423). The construction of the mutants G166E and E156Q/G166K was described by Wells et al. (1987), Proc. Natl. Acad. Sci. USA, 84, 1219-1223, and the mutant G166I by Estell et al. (1986), Science, 233, 659-663. Combinations of the mutations around the active site (positions 221 and 225) and around the P1 binding pocket (156 and 166) were obtained by ligation of mutated restriction fragments split by the enzyme PpuMI. See EPO Publication No. 0 251 446, published Jan. 9, 1988. All mutants were verified by dideoxy sequencing (Sanger et al. (1977), Proc. Natl. Acad. Sci. USA, 74, 5463-5467). The mutated gene encoding the P225A mutant was a kind gift from T. Graycar (Genencor, S. San Francisco, Calif.). It was synthesized by primer extension mutagenesis on a single stranded M13 subclone of Bacillus amyloliquefaciens subtilisin using the mutagenic oligonucleotides depicted in FIG. 13.

The above identified subtilisin variants were used in conjunction with the ligation substrates disclosed in Example 2 and Example 3 to provide the previously discussed results.

EXAMPLE 2 Synthesis of FPTIPAAPF-glycolate-F-amide

The synthesis of a C-terminal amide ligation peptide is accomplished by attachment of the first Boc protected amino acid (Boc-Phenylalanine) to 4-methylbenzhydrylamine resin (Bachem L. A.) using diisopropylcarbodiimide in methylene chloride. Standard Boc synthetic protocols for the synthesis of the protected peptide are followed (Barany, G., et al. (1979), in "Solid-Phase Peptide Synthesis", in The Peptides, Analysis, Synthesis, Biology. Special Methods in Peptide Synthesis, Part A, Vol. 2 (E. Gross, J. Meienhofer (eds.), New York, Academic Press), pp. 3-254). The glycolic acid residue was incorporated as the corresponding t-butyl ether. Removal of the t-butyl ether with 50% trifluoroacetic acid in methylene chloride and coupling of the subsequent amino acid with diisopropylcarbodiimide and 10 mol% dimethylaminopyridine in 90% methylene chloride, 10% dimethylacetamide, afforded the ester linkage. Subsequent amino acids were incorporated again using standard Boc protocols. The crude peptide was deprotected and removed from the resin with hydrogen fluoride. The crude peptide was then purified by reverse phase HPLC. Purity was determined by mass spectral analysis. M+1 calc. 1164.5, M+1 obs. 1164.6. Similar methods are used to produce other first ligation peptides.
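
As a rough cross-check of the reported M+1 value, the mass of FPTIPAAPF-glycolate-F-amide can be summed from residue masses. The Python sketch below is an illustration only. It assumes that the reported value refers to the singly protonated monoisotopic species, and the residue, terminal group, and proton masses are standard monoisotopic values supplied by this example rather than figures from the original disclosure. The function name `m_plus_1` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids present in this peptide.
# Values supplied by this sketch, not by the original text.
RESIDUE_MASS = {
    "F": 147.06841,
    "P": 97.05276,
    "T": 101.04768,
    "I": 113.08406,
    "A": 71.03711,
}
GLYCOLATE_RESIDUE = 58.00548  # -O-CH2-CO- unit (C2H2O2) forming the ester link
N_TERMINAL_H = 1.00783        # hydrogen on the free alpha-amino group
C_TERMINAL_AMIDE = 16.01872   # -NH2 of the C-terminal amide
PROTON = 1.00728              # added charge carrier for the M+1 ion


def m_plus_1(acyl_sequence, leaving_sequence):
    """Return the monoisotopic M+1 mass of acyl_sequence-glycolate-leaving_sequence-amide.

    Hypothetical helper; sequences are given in single-letter code.
    """
    residues = sum(RESIDUE_MASS[r] for r in acyl_sequence + leaving_sequence)
    neutral = residues + GLYCOLATE_RESIDUE + N_TERMINAL_H + C_TERMINAL_AMIDE
    return neutral + PROTON


if __name__ == "__main__":
    print(f"M+1 = {m_plus_1('FPTIPAAPF', 'F'):.1f}")
```

Under these assumptions the sum comes to about 1164.6, in line with the observed value and within 0.1 mass unit of the calculated value reported above.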

EXAMPLE 3 Growth Hormone Expression and Semi-Synthesis

The truncated form of human growth hormone (des-octa hGH containing residues 9-191) was expressed in E. coli W3110 (tonA; ATCC 27325) using the E. coli alkaline phosphatase promoter and the signal peptide from E. coli heat-stable enterotoxin II (Chang et al. (1987), Gene, 55, 189-196). Cell paste was resuspended in four volumes of 10 mM Tris HCl, pH 8.0, to release the hGH from the periplasmic space. The cells were pelleted and the hGH was purified from the supernatant (Olson et al. (1981), Nature, 293, 408-411).

Peptides derived from the amino-terminal sequence of hGH having the sequence FPTIPLSR or FPTIPAAPF on their carboxyl-termini were esterified with the leaving group glycolate-Phe-amide. The reaction between des-octa hGH (0.5 mM) and either of the two peptide substrates (2.4 mM final) was performed in Tricine (pH 8.0) at 20° C. using either S221C/P225A or G166E/S221C/P225A subtilisin at final concentrations of 3.0 μM or 3.4 μM, respectively. The reaction was stopped by mixing an aliquot with an equal volume of 100 mM NEM (which alkylates the active site cysteine). Loading buffer (5% 2-mercaptoethanol, 5% glycerol, 10 mM Tris HCl pH 8.0, 1 mM EDTA, 0.25% SDS, final concentrations) was added and the samples were boiled and analyzed by SDS-PAGE (Laemmli (1970), Nature, 227, 680-685). See FIG. 12. The ligation products were blotted onto polyvinylidene difluoride membranes (Matsudaira et al. (1987), J. Biol. Chem., 262, 10035-10038) and the amino-terminal sequences were determined.

EXAMPLE 4 Modification of Protropin with peptides

In addition to the ligation of peptides to form larger proteins, Protropin (met-hGH) has also been modified with the S221C/P225A peptide ligase. One of the peptides used in the growth hormone semi-synthesis, FPTIPAAPF-glycolyl-F-amide, was used in these ligations. Protropin (75 μM) was reacted with the peptide (350 μM to 7 mM) in the presence of 3 μM of the ligase in 115 μl of the reaction buffer (10 mM Tricine, pH 8.0) at 25° C. The progress of the ligation reaction was monitored by mixing an aliquot (15 μl) with an equal volume of 100 mM NEM at time intervals between 1 minute and 1 hour. The samples were then run on an SDS-PAGE gel (Laemmli (1970), Nature 227, 680-685). FIG. 14 shows that most of the substrate is ligated after an hour, as shown by the appearance of a product of expected molecular weight.
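
The reaction composition stated above can be restated as molar ratios and absolute amounts. The short Python sketch below is only a worked illustration of that arithmetic using the concentrations and volume given in this example; the function names are hypothetical.

```python
def molar_ratio(conc_a_um, conc_b_um):
    """Ratio of two concentrations given in the same unit (here micromolar)."""
    return conc_a_um / conc_b_um


def amount_nmol(conc_um, volume_ul):
    """Amount of solute in nmol: micromolar x microliters gives pmol; divide by 1000."""
    return conc_um * volume_ul / 1000.0


# Values as stated for the Protropin ligation.
PROTROPIN_UM = 75.0
PEPTIDE_LOW_UM = 350.0
PEPTIDE_HIGH_UM = 7000.0  # 7 mM
LIGASE_UM = 3.0
REACTION_VOLUME_UL = 115.0

if __name__ == "__main__":
    low = molar_ratio(PEPTIDE_LOW_UM, PROTROPIN_UM)
    high = molar_ratio(PEPTIDE_HIGH_UM, PROTROPIN_UM)
    print(f"peptide excess over Protropin: {low:.1f}-fold to {high:.1f}-fold")
    print(f"Protropin per ligase: {molar_ratio(PROTROPIN_UM, LIGASE_UM):.0f}")
    print(f"Protropin in reaction: {amount_nmol(PROTROPIN_UM, REACTION_VOLUME_UL):.3f} nmol")
```

On these figures the peptide ester is present in roughly 4.7-fold to 93-fold molar excess over Protropin, and the ligase is used at 1/25 of the Protropin concentration.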

EXAMPLE 5 Confirmation of ligase substrate specificity

To demonstrate that the ligase substrate specificity is similar to the substrate specificity for the hydrolysis of peptides by subtilisin, several different ligations were tried. The substrates used and the results obtained confirm that ligation is restricted to ligation substrates that have amino acid sequences similar to the sequence of subtilisin hydrolysis substrates.

The presence of proline at either the P1' or P2' position has been shown to inhibit subtilisin hydrolase activity (Carter et al. (1989), Proteins: Structure, Function and Genetics, 6, 240-248). hGH has a proline at the P2' position while des-1 hGH has a proline at the P1' position. Accordingly, substitution of Protropin with either hGH or des-1 hGH in the experimental conditions of Example 4 resulted in no ligation (data not shown).

This result was verified using insulin-like growth factor-1 (IGF-1). The N-terminal sequence of IGF-1 is GPETLC. This proline at position P2' was expected to prevent ligation, similar to hGH. The substitution of IGF-1 for Protropin in the Example 4 experimental conditions resulted in no ligation (data not shown).

These results indicate that the N-terminal sequences of the peptides to be ligated are important, just as they are for the corresponding sequence in hydrolysis reactions with the wild-type subtilisin.

EXAMPLE 6 Specificity is further defined by the N-terminus

A further level of substrate specificity is obtained due to the importance of the conformation of the N-terminus of the substrate. In addition to the amino acid sequence, the N-terminal structure of one of the pair of prospective substrates is important. Subtilisin has been shown to prefer an extended conformation of the amino acids at the cleavage site. Several acceptor proteins that possess favorable amino acid sequences failed to be ligated, were ligated in only modest amounts, or required special treatment for ligation to occur, suggesting that these substrates lack the necessary extended conformation at the N-terminus.

The potential importance of conformation and N-terminal flexibility was demonstrated using IGF-1. Two modified forms of IGF-1 were used. The IGF-1 from brain has 3 residues removed from the amino terminus. This form of the protein, des-3 IGF-1, no longer has a proline at the P2' position. Under Example 4 conditions, which are non-reducing, no ligation of either peptide to des-3 IGF-1 was seen. However, after pretreating the substrate with reducing agent (10 mM DTT) for 1 hour, ligation was observed (5%). Another peptide, Long Arg-3 IGF-1, has the first 13 amino acid residues of porcine growth hormone (MFPAMPLSSLFVN) attached to the Glu3→Arg variant of IGF-1. The high N-terminal sequence homology of this protein to Protropin would suggest it can be used as a substrate. Under Example 4 conditions with the addition of pretreatment with 10 mM DTT for an hour, some ligation (≈5%) was observed. By adding 0.1% SDS and 10% DMSO, and carrying out the reaction at 50° C., the amount of product was increased 2- to 3-fold (data not shown).

Using the hGH receptor in place of Protropin under Example 4 experimental conditions resulted in no ligation, even after 5 hours with a 4-fold higher concentration of the ligase (data not shown). While the hGH receptor has the N-terminal sequence FSGSEAT and thus the sequence conforms to acceptable subtilisin substrate primary sequence, the lack of ligation is attributed to either the lack of accessibility of the N-terminus or a highly structured N-terminus.

A similar result was obtained for Relaxin. Relaxin consists of two chains, A and B. The A chain has a pyroglutamate at the N-terminus, so the α-amine is incapable of acting as a nucleophile. In contrast, the B-chain has the sequence DSWM and should be a suitable substrate in the presence of high salt. However, modification of the Example 4 experimental conditions to include 2M NaCl in the buffer solution resulted in no ligation after an hour (data not shown).

These results demonstrate that by judicious selection of reaction conditions (including temperature, detergents, pH and non-aqueous solvents), it should be possible to selectively alter the conformation of the N-terminal segment of the substrate protein while retaining the ligase activity, thus allowing highly specific ligations to occur.

Having described the preferred embodiments of the present invention, it will be apparent to those of ordinary skill in the art that various modifications may be made and that such modifications are intended to be within the scope of the present invention.

All references are expressly incorporated herein by reference.

What is claimed is:
 1. Nucleic acid encoding a subtilisin-type serine protease variant wherein said variant has an amino acid sequence not found in nature and is derived from a precursor subtilisin-type serine protease having an α-helix containing proline at a residue equivalent to proline 225 in Bacillus amyloliquefaciens subtilisin and a catalytic serine at or near the amino terminus of said α-helix equivalent to serine 221 in Bacillus amyloliquefaciens subtilisin, said nucleic acid encoding the: a) replacement of said catalytic serine with a first amino acid having a different nucleophilic side chain, and b) replacement of said proline with a second different amino acid comprising a helix-forming amino acid, wherein said serine protease variant encoded by said nucleic acid is characterized by having peptide ligase activity in aqueous solution which is greater than that of said precursor serine protease variant containing only said substitution of said catalytic serine.
 2. The nucleic acid of claim 1 wherein said precursor subtilisin-type serine protease is subtilisin.
 3. The nucleic acid of claim 2 wherein said catalytic serine comprises serine 221 and said second amino acid residue comprises proline 225 of the amino acid sequence of Bacillus amyloliquefaciens subtilisin.
 4. The nucleic acid of claim 1 wherein said catalytic serine is replaced by cysteine.
 5. The nucleic acid of claim 1 wherein said helix-forming amino acid is selected from the group consisting of alanine, leucine, methionine, glutamine, valine and serine.
 6. Expression vector containing the nucleic acid of any of claims 1, 2, 3, 4, or 5.
 7. Host cell transformed with the expression vector of claim 6.
 8. Nucleic acid encoding a subtilisin-type serine protease variant wherein said variant has an amino acid sequence not found in nature and is derived from a precursor subtilisin-type serine protease having an α-helix containing proline at a residue equivalent to proline 225 in Bacillus amyloliquefaciens subtilisin and a catalytic serine at or near the amino terminus of said α-helix equivalent to serine 221 in Bacillus amyloliquefaciens subtilisin, said nucleic acid encoding the: a) replacement of said catalytic serine with a first amino acid having a different nucleophilic side chain, and b) replacement of said proline with a second different amino acid having a side chain volume which is less than the side chain volume of proline, wherein said serine protease variant encoded by said nucleic acid is characterized by having peptide ligase activity in aqueous solution which is greater than that of said precursor serine protease variant containing only said substitution of said catalytic serine.
 9. The nucleic acid of claim 8 wherein said precursor subtilisin-type serine protease is subtilisin.
 10. The nucleic acid of claim 9 wherein said catalytic serine comprises serine 221 and said second amino acid residue comprises proline 225 of the amino acid sequence of Bacillus amyloliquefaciens subtilisin.
 11. The nucleic acid of claim 8 wherein said catalytic serine is replaced by cysteine.
 12. Expression vector containing the nucleic acid of any of claims 8 through 11.
 13. Host cell transformed with the vector of claim 12.
 14. Nucleic acid encoding a subtilisin-type serine protease variant wherein said variant has an amino acid sequence not found in nature and is derived from a precursor subtilisin-type serine protease having an α-helix containing proline at a residue equivalent to proline 225 in Bacillus amyloliquefaciens subtilisin and a catalytic serine at or near the amino terminus of said α-helix equivalent to serine 221 in Bacillus amyloliquefaciens subtilisin, said nucleic acid encoding: a) replacement of said catalytic serine with a first amino acid having a different nucleophilic side chain, and b) replacement of said proline with a second different amino acid selected from the group consisting of alanine, lysine, arginine, glutamate, leucine, methionine, glutamine, valine and serine, wherein said serine protease variant encoded by said nucleic acid is characterized by having peptide ligase activity in aqueous solution which is greater than that of said precursor serine protease variant containing only said substitution of said catalytic serine.
 15. The nucleic acid of claim 14 wherein said precursor subtilisin-type serine protease is subtilisin.
 16. The nucleic acid of claim 15 wherein said active site serine comprises serine 221 and said second amino acid residue comprises proline 225 of the amino acid sequence of Bacillus amyloliquefaciens subtilisin.
 17. The nucleic acid of claim 14 wherein said catalytic serine is replaced by cysteine.
 18. The nucleic acid of claim 14 wherein said second different amino acid is selected from the group consisting of alanine, lysine, arginine or glutamate.
 19. The nucleic acid of claim 14 wherein said second different amino acid is alanine.
 20. Expression vector containing the nucleic acid of any of claims 14 through 19.
 21. Host cell transformed with the vector of claim 20.